
    Dropout Model Evaluation in MOOCs

    The field of learning analytics needs to adopt a more rigorous approach for predictive model evaluation that matches the complex practice of model-building. In this work, we present a procedure to statistically test hypotheses about model performance which goes beyond the state-of-the-practice in the community to analyze both algorithms and feature extraction methods from raw data. We apply this method to a series of algorithms and feature sets derived from a large sample of Massive Open Online Courses (MOOCs). While a complete comparison of all potential modeling approaches is beyond the scope of this paper, we show that this approach reveals a large gap in dropout prediction performance between forum-, assignment-, and clickstream-based feature extraction methods: the clickstream-based features perform significantly better than the other two, which are in turn indistinguishable from one another. This work has methodological implications for evaluating predictive or AI-based models of student success, and practical implications for the design and targeting of at-risk student models and interventions.

    Subgroup Robustness Grows On Trees: An Empirical Baseline Investigation

    Researchers have proposed many methods for fair and robust machine learning, but comprehensive empirical evaluation of their subgroup robustness is lacking. In this work, we address this gap in the context of tabular data, where sensitive subgroups are clearly defined, real-world fairness problems abound, and prior works often do not compare to state-of-the-art tree-based models as baselines. We conduct an empirical comparison of several previously proposed methods for fair and robust learning alongside state-of-the-art tree-based methods and other baselines. Via experiments with more than 340,000 model configurations on eight datasets, we show that tree-based methods have strong subgroup robustness, even when compared to robustness- and fairness-enhancing methods. Moreover, the best tree-based models tend to show good performance over a range of metrics, while robust or group-fair models can show brittleness, with significant performance differences across different metrics for a fixed model. We also demonstrate that tree-based models show less sensitivity to hyperparameter configurations, and are less costly to train. Our work suggests that tree-based ensemble models make an effective baseline for tabular data, and are a sensible default when subgroup robustness is desired. For associated code and detailed results, see https://github.com/jpgard/subgroup-robustness-grows-on-trees. Comment: To appear at Neural Information Processing Systems (NeurIPS) 2022. Code at https://github.com/jpgard/subgroup-robustness-grows-on-tree

    VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

    We introduce VisIT-Bench (Visual InsTruction Benchmark), a benchmark for evaluation of instruction-following vision-language models for real-world use. Our starting point is curating 70 'instruction families' that we envision instruction-tuned vision-language models should be able to address. Extending beyond evaluations like VQAv2 and COCO, tasks range from basic recognition to game playing and creative generation. Following curation, our dataset comprises 592 test queries, each with a human-authored instruction-conditioned caption. These descriptions surface instruction-specific factors, e.g., for an instruction asking about the accessibility of a storefront for wheelchair users, the instruction-conditioned caption describes ramps/potential obstacles. These descriptions enable 1) collecting human-verified reference outputs for each instance; and 2) automatic evaluation of candidate multimodal generations using a text-only LLM, aligning with human judgment. We quantify quality gaps between models and references using both human and automatic evaluations; e.g., the top-performing instruction-following model wins against the GPT-4 reference in just 27% of the comparisons. VisIT-Bench is dynamic: to participate, practitioners simply submit their model's response on the project website. Data, code, and leaderboard are available at visit-bench.github.io

    Influence of diabetes on ambulation and inflammation in men and women with symptomatic peripheral artery disease

    Objective: To determine whether diabetes and sex were factors associated with ambulatory function, endothelial cell inflammation, oxidative stress, and apoptosis, and with circulating biomarkers of inflammation and antioxidant capacity in patients with peripheral artery disease (PAD) and claudication. Materials/Methods: Ambulatory function of 180 symptomatic men and women with PAD was assessed during a graded maximal treadmill test, 6-minute walk test, and 4-meter walk test. Patients were further characterized on endothelial effects of circulating factors present in the sera using a cell culture-based bioassay on primary human arterial endothelial cells, and on circulating inflammatory and vascular biomarkers. Results: Men and women with diabetes had greater prevalence (p = 0.007 and p = 0.015, respectively) of coronary artery disease (CAD) than patients without diabetes. To assure that this difference did not influence planned comparisons, the data set was stratified on CAD. Diabetic men with CAD had a lower peak walking time (PWT) during the treadmill test and a slower 4-meter gait speed compared to non-diabetic men with CAD (p < 0.05). Diabetic women with CAD had a lower PWT compared to their non-diabetic counterparts (p < 0.01). Additionally, diabetic men with CAD had higher pigment epithelium-derived factor (PEDF) (p < 0.05) than their non-diabetic counterparts, and diabetic women with CAD had higher leptin (p < 0.01) and interleukin-8 levels (p < 0.05). Conclusions: In patients with PAD, diabetic men and women with CAD had more severe claudication than their non-diabetic counterparts, as measured by shorter PWT, and the men had further ambulatory impairment manifested by slower 4-meter gait speed. Furthermore, the diabetic patients with CAD had elevations in interleukin-8, leptin, and PEDF.

    Cachexia index for prognostication in surgical patients with locally advanced oesophageal or gastric cancer: multicentre cohort study

    Background: Features of cancer cachexia adversely influence patient outcomes, yet few currently inform clinical decision-making. This study assessed the value of the cachexia index (CXI), a novel prognostic marker, in patients for whom neoadjuvant chemotherapy and surgery for oesophagogastric cancer is planned. Methods: Consecutive patients newly diagnosed with locally advanced (T3–4 or at least N1) oesophagogastric cancer between 1 January 2010 and 31 December 2015 were identified through the West of Scotland and South-East Scotland Cancer Networks. CXI was calculated as (L3 skeletal muscle index) × (serum albumin)/(neutrophil lymphocyte ratio). Sex-stratified cut-off values were determined based on the area under the curve (AUC), and patients were divided into groups with low or normal CXI. Primary outcomes were disease progression during neoadjuvant chemotherapy and overall survival (at least 5 years of follow-up). Results: Overall, 385 patients (72% men, median age 66 years) were treated with neoadjuvant chemotherapy for oesophageal (274) or gastric (111) cancer across the study interval. Although patients with a low CXI (men: CXI below 52 (AUC 0.707); women: CXI below 41 (AUC 0.759)) were older with more co-morbidity, disease characteristics were comparable to those in patients with a normal CXI. Rates of disease progression during neoadjuvant chemotherapy, leading to inoperability, were higher in patients with a low CXI (28 versus 12%; adjusted OR 3.07, 95% c.i. 1.67 to 5.64; P < 0.001). Low CXI was associated with worsened postoperative mortality (P = 0.019) and decreased overall survival (median 14.9 versus 56.9 months; adjusted HR 1.85, 1.42 to 2.42; P < 0.001). Conclusion: CXI is associated with disease progression, worse postoperative mortality, and overall survival, and could improve prognostication and decision-making in patients with locally advanced oesophagogastric cancer.
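
The CXI formula and the sex-stratified cut-offs quoted in the abstract above can be illustrated with a minimal Python sketch. The function names, the dictionary keys, and the example input values are hypothetical and chosen only for illustration. The abstract does not state measurement units, so the cut-offs (52 for men, 41 for women) are only meaningful if the inputs use the same units as the study.

```python
# Cut-offs quoted in the abstract: CXI below these values counts as "low".
# The keys "male" / "female" are an assumption of this sketch.
LOW_CXI_CUTOFF = {"male": 52.0, "female": 41.0}


def cachexia_index(skeletal_muscle_index: float,
                   serum_albumin: float,
                   neutrophil_lymphocyte_ratio: float) -> float:
    """CXI = (L3 skeletal muscle index) x (serum albumin) / (NLR)."""
    if neutrophil_lymphocyte_ratio <= 0:
        raise ValueError("neutrophil lymphocyte ratio must be positive")
    return skeletal_muscle_index * serum_albumin / neutrophil_lymphocyte_ratio


def has_low_cxi(cxi: float, sex: str) -> bool:
    """Return True when CXI falls below the sex-specific cut-off."""
    return cxi < LOW_CXI_CUTOFF[sex]


# Hypothetical inputs, not taken from the study data.
example_cxi = cachexia_index(50.0, 4.0, 4.0)
print(example_cxi, has_low_cxi(example_cxi, "male"), has_low_cxi(example_cxi, "female"))
```

With these made-up inputs the same CXI value is classified as low for a man but normal for a woman, which is the practical effect of using sex-stratified cut-offs.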